A geometric framework for outlier detection in high‐dimensional data
Authors
Abstract
Outlier or anomaly detection is an important task in data analysis. We discuss the problem from a geometrical perspective and provide a framework which exploits the metric structure of a data set. Our approach rests on the manifold assumption, that is, that the observed, nominally high-dimensional data lie on a much lower dimensional manifold and that this intrinsic structure can be inferred with manifold learning methods. We show that exploiting this structure significantly improves the detection of outlying observations in high-dimensional data. We also suggest a novel, mathematically precise and widely applicable distinction between distributional and structural outliers based on the geometry and topology of the data manifold that clarifies conceptual ambiguities prevalent throughout the literature. Our experiments focus on functional data as one class of structured high-dimensional data, but the framework we propose is completely general and we include image and graph data applications. Our results show that the outlier structure of high-dimensional and non-tabular data can be detected and visualized using manifold learning methods and quantified using standard outlier scoring methods applied to the manifold embedding vectors. This article is categorized under: Technologies > Structure Discovery and Clustering; Fundamental Concepts of Data and Knowledge > Data Concepts; Technologies > Visualization
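The abstract describes a two-step pipeline: embed the data with a manifold learning method, then apply a standard outlier score to the embedding vectors. It does not name the specific methods, so the following is only a minimal sketch under stated assumptions: classical MDS stands in for the manifold learning step, the mean distance to the k nearest neighbors stands in for the scoring step, and the functional data (phase-shifted sine curves plus one curve of a different frequency) are synthetic and hypothetical. The function names `classical_mds` and `knn_outlier_score` are illustrative, not taken from the article.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points from a pairwise distance matrix via classical (Torgerson) MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering the squared distances yields a Gram matrix of centered points.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


def knn_outlier_score(embedding, k=5):
    """Score each point by its mean distance to its k nearest neighbors."""
    diffs = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance of a point to itself.
    return np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)


# Hypothetical functional data: 40 phase-shifted sine curves on a 100-point grid.
grid = np.linspace(0.0, 1.0, 100)
phases = np.linspace(0.0, 0.5, 40)
inliers = np.sin(2 * np.pi * (grid[None, :] + phases[:, None]))

# One curve with a different frequency, which lies off the inlier manifold.
outlier = np.sin(6 * np.pi * grid)
curves = np.vstack([inliers, outlier])  # shape (41, 100), outlier is row 40

# Pairwise L2 distances between the discretized curves.
dist = np.linalg.norm(curves[:, None, :] - curves[None, :, :], axis=-1)

embedding = classical_mds(dist, n_components=2)
scores = knn_outlier_score(embedding, k=5)
print("highest-scoring curve:", int(np.argmax(scores)))
```

In this toy setting the inlier curves differ only in phase, so they trace out a one-dimensional curve in the nominally 100-dimensional space, and the off-manifold curve ends up far from all of them in the embedding. Any other embedding method or scoring rule could be substituted for the two stand-ins used here.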
Similar resources
A Framework for Outlier Detection in Geographic Spatial Data
Outlier detection is a very interesting, useful and challenging problem in the field of data mining. Because spatial data are sparse, distance-based clustering algorithms do not work for finding outliers in them. The problem of finding irregular features in spatial data needs to be explored. Many existing approaches have been proposed to overcome the problem of outlier detection in spatial Geogr...
A statistical test for outlier identification in data envelopment analysis
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these ‘‘deterministic’’ frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...
A Unified Subspace Outlier Ensemble Framework for Outlier Detection
The task of outlier detection is to find small groups of data objects that are exceptional when compared with rest large amount of data. Detection of such outliers is important for many applications such as fraud detection and customer migration. Most such applications are high dimensional domains in which the data may contain hundreds of dimensions. However, the outlier detection problem ...
Outlier detection for skewed data
Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method which does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel-Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingne...
A framework for identifying and prioritizing factors affecting customers’ online shopping behavior in Iran
The purpose of this study is to identify the factors that lead customers to shop online in Iran and to investigate the importance of the discovered factors in online customers’ decisions. In the identifying phase, to discover the factors affecting online shopping behavior of customers in Iran, the derived reference model summarizing antecedents of online shopping proposed by change et al. was us...
Journal
Journal title: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery
Year: 2023
ISSN: 1942-4787, 1942-4795
DOI: https://doi.org/10.1002/widm.1491